Overview

Dataset statistics

Number of variables: 14
Number of observations: 28564
Missing cells: 1701
Missing cells (%): 0.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 17.0 MiB
Average record size in memory: 624.5 B
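
The missing-cell figures can be cross-checked from numbers reported elsewhere in this profile. The sketch below is only an arithmetic check, not part of the report itself: it assumes the per-column missing counts are 809 (latitude), 809 (longitude) and 83 (the column that the sample table suggests is species), and it does not read the underlying dataset.

```python
# Figures copied from this report; the dataset itself is not loaded.
n_variables = 14
n_observations = 28564

# Per-column missing counts as reported in the variable sections.
# Attributing the 83 missing values to "species" is an inference from the sample table.
missing_per_column = {"latitude": 809, "longitude": 809, "species": 83}

missing_cells = sum(missing_per_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"missing cells: {missing_cells} of {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}%")
```

The total and the rounded percentage agree with the values listed under Dataset statistics.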

Variable types

Text: 8
Categorical: 2
Numeric: 4

Alerts

High correlation: class_name is highly overall correlated with collection
High correlation: collection is highly overall correlated with class_name and 1 other field
High correlation: latitude is highly overall correlated with longitude
High correlation: longitude is highly overall correlated with latitude
High correlation: rating is highly overall correlated with collection
Imbalance: class_name is highly imbalanced (87.7%)
Missing: latitude has 809 (2.8%) missing values
Missing: longitude has 809 (2.8%) missing values
Unique: filename has unique values
Zeros: rating has 7948 (27.8%) zeros
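
The 87.7% imbalance figure for class_name can be reproduced from the class counts reported in this profile. The sketch below assumes the score is one minus the normalized Shannon entropy of the class frequencies. That formula is an assumption that happens to match the reported number; it is not quoted from the report, and the function name is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no distribution to be imbalanced
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts as listed for class_name in this report.
class_counts = {"Aves": 27648, "Amphibia": 583, "Mammalia": 178, "Insecta": 155}

score = imbalance_score(list(class_counts.values()))
print(f"imbalance: {score:.1%}")
```

A score of 0 would mean perfectly balanced classes and a score near 1 means one class dominates, which is the case here with Aves at 96.8% of rows.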

Reproduction

Analysis started: 2025-03-20 01:38:40.581826
Analysis finished: 2025-03-20 01:38:45.145854
Duration: 4.56 seconds
Software version: ydata-profiling v4.15.0
Download configuration: config.json
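
A report of this kind can be regenerated with ydata-profiling along the following lines. This is a minimal sketch under stated assumptions: the input is taken to be a CSV file, and the function name, file paths and report title are hypothetical, since the report does not say how the data was loaded or which configuration was used.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report; returns the output path."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # assumption: the dataset is stored as CSV
    report = ProfileReport(df, title="Dataset profile")  # hypothetical title
    report.to_file(html_path)
    return html_path


# Example call (the path is hypothetical):
# build_report("train.csv")
```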

Variables

primary_label
Text

Distinct: 206
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 6.584687
Min length: 5

Characters and Unicode

Total characters: 188085
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1139490
2nd row: 1139490
3rd row: 1192948
4th row: 1192948
5th row: 1192948

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| grekis | 990 | 3.5% |
| compau | 808 | 2.8% |
| trokin | 787 | 2.8% |
| roahaw | 709 | 2.5% |
| banana | 610 | 2.1% |
| whtdov | 572 | 2.0% |
| socfly1 | 543 | 1.9% |
| yeofly1 | 525 | 1.8% |
| bobfly1 | 514 | 1.8% |
| wbwwre1 | 499 | 1.7% |
| Other values (196) | 22007 | 77.0% |

Most occurring characters

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 1 | 16891 | 9.0% |
| a | 15584 | 8.3% |
| r | 15121 | 8.0% |
| o | 13677 | 7.3% |
| b | 10847 | 5.8% |
| l | 9891 | 5.3% |
| t | 9544 | 5.1% |
| c | 9280 | 4.9% |
| e | 9157 | 4.9% |
| s | 8204 | 4.4% |
| Other values (23) | 69889 | 37.2% |

Most occurring categories, scripts and blocks: (unknown), 188085 characters (100.0%) in each case. The per-category, per-script and per-block character tables are identical to the table above.

secondary_labels
Text

Distinct: 745
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Length

Max length: 115
Median length: 4
Mean length: 5.0811861
Min length: 2

Characters and Unicode

Total characters: 145139
Distinct characters: 38
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 550
Unique (%): 1.9%

Sample

1st row: ['']
2nd row: ['']
3rd row: ['']
4th row: ['']
5th row: ['']

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| (empty) | 25910 | 86.8% |
| grekis | 489 | 1.6% |
| whtdov | 309 | 1.0% |
| trokin | 188 | 0.6% |
| soulap1 | 114 | 0.4% |
| pirfly1 | 112 | 0.4% |
| rugdov | 110 | 0.4% |
| banana | 104 | 0.3% |
| saffin | 102 | 0.3% |
| yebela1 | 96 | 0.3% |
| Other values (123) | 2317 | 7.8% |

Most occurring characters

| Value | Count | Frequency (%) |
| --- | --- | --- |
| ' | 59652 | 41.1% |
| [ | 28564 | 19.7% |
| ] | 28564 | 19.7% |
| r | 2271 | 1.6% |
| 1 | 1987 | 1.4% |
| a | 1873 | 1.3% |
| o | 1845 | 1.3% |
| e | 1404 | 1.0% |
| t | 1396 | 1.0% |
| b | 1372 | 0.9% |
| Other values (28) | 16211 | 11.2% |

Most occurring categories, scripts and blocks: (unknown), 145139 characters (100.0%) in each case. The per-category, per-script and per-block character tables are identical to the table above.
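
The sample values (['']) and the dominance of the characters ', [ and ] suggest that this column holds Python-style lists serialized as strings, which is why it is profiled as Text. A minimal sketch of turning such a cell back into a list follows. It assumes every cell is a valid Python list literal, which this report does not confirm, and the helper name is hypothetical.

```python
import ast


def parse_labels(cell):
    """Parse a stringified list such as "['a', 'b']" and drop empty entries."""
    values = ast.literal_eval(cell)  # safe evaluation of Python literals only
    return [v for v in values if v]


# Both inputs appear as secondary_labels values in the sample rows of this report.
print(parse_labels("['']"))
print(parse_labels("['65448', '22976', '476538']"))
```

Treating the placeholder [''] as an empty list makes it easier to count how many recordings actually carry secondary labels.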

type
Text

Distinct: 736
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB

Length

Max length: 96
Median length: 8
Mean length: 8.8714116
Min length: 4

Characters and Unicode

Total characters: 253403
Distinct characters: 54
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 501
Unique (%): 1.8%

Sample

1st row: ['']
2nd row: ['']
3rd row: ['']
4th row: ['']
5th row: ['']

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| song | 11763 | 32.7% |
| call | 10268 | 28.6% |
| (empty) | 8136 | 22.6% |
| flight | 1429 | 4.0% |
| alarm | 664 | 1.8% |
| calls | 513 | 1.4% |
| duet | 334 | 0.9% |
| dawn | 211 | 0.6% |
| uncertain | 171 | 0.5% |
| begging | 144 | 0.4% |
| Other values (505) | 2310 | 6.4% |

Most occurring characters

| Value | Count | Frequency (%) |
| --- | --- | --- |
| ' | 64128 | 25.3% |
| [ | 28564 | 11.3% |
| ] | 28564 | 11.3% |
| l | 24267 | 9.6% |
| g | 14298 | 5.6% |
| n | 13932 | 5.5% |
| a | 13701 | 5.4% |
| s | 13278 | 5.2% |
| o | 12730 | 5.0% |
| c | 11534 | 4.6% |
| Other values (44) | 28407 | 11.2% |

Most occurring categories, scripts and blocks: (unknown), 253403 characters (100.0%) in each case. The per-category, per-script and per-block character tables are identical to the table above.

filename
Text

Alerts: Unique

Distinct: 28564
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Length

Max length: 23
Median length: 22
Mean length: 20.041101
Min length: 16

Characters and Unicode

Total characters: 572454
Distinct characters: 40
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 28564
Unique (%): 100.0%

Sample

1st row: 1139490/CSA36385.ogg
2nd row: 1139490/CSA36389.ogg
3rd row: 1192948/CSA36358.ogg
4th row: 1192948/CSA36366.ogg
5th row: 1192948/CSA36373.ogg

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 1192948/csa36373.ogg | 1 | < 0.1% |
| ywcpar/inat922688.ogg | 1 | < 0.1% |
| 1139490/csa36385.ogg | 1 | < 0.1% |
| 1139490/csa36389.ogg | 1 | < 0.1% |
| ywcpar/inat370907.ogg | 1 | < 0.1% |
| ywcpar/inat370911.ogg | 1 | < 0.1% |
| ywcpar/inat370914.ogg | 1 | < 0.1% |
| ywcpar/inat47452.ogg | 1 | < 0.1% |
| ywcpar/inat47453.ogg | 1 | < 0.1% |
| ywcpar/inat51530.ogg | 1 | < 0.1% |
| Other values (28554) | 28554 | > 99.9% |

Most occurring characters

| Value | Count | Frequency (%) |
| --- | --- | --- |
| g | 61874 | 10.8% |
| o | 42241 | 7.4% |
| 1 | 36458 | 6.4% |
| / | 28564 | 5.0% |
| . | 28564 | 5.0% |
| a | 22782 | 4.0% |
| C | 21366 | 3.7% |
| X | 21204 | 3.7% |
| 2 | 19828 | 3.5% |
| 4 | 18830 | 3.3% |
| Other values (30) | 270743 | 47.3% |

Most occurring categories, scripts and blocks: (unknown), 572454 characters (100.0%) in each case. The per-category, per-script and per-block character tables are identical to the table above.

collection
Categorical

Alerts: High correlation

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 4
Median length: 2
Mean length: 2.5096625
Min length: 2

Characters and Unicode

Total characters: 71686
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CSA
2nd row: CSA
3rd row: CSA
4th row: CSA
5th row: CSA

Common Values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| XC | 21204 | 74.2% |
| iNat | 7198 | 25.2% |
| CSA | 162 | 0.6% |

[Figures not reproduced: histogram of lengths of the category; Common Values (Plot)]

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| xc | 21204 | 74.2% |
| inat | 7198 | 25.2% |
| csa | 162 | 0.6% |

Most occurring characters

| Value | Count | Frequency (%) |
| --- | --- | --- |
| C | 21366 | 29.8% |
| X | 21204 | 29.6% |
| i | 7198 | 10.0% |
| N | 7198 | 10.0% |
| a | 7198 | 10.0% |
| t | 7198 | 10.0% |
| S | 162 | 0.2% |
| A | 162 | 0.2% |

Most occurring categories, scripts and blocks: (unknown), 71686 characters (100.0%) in each case. The per-category, per-script and per-block character tables are identical to the table above.

rating
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9170634
Minimum: 0
Maximum: 5
Zeros: 7948
Zeros (%): 27.8%
Negative: 0
Negative (%): 0.0%
Memory size: 223.3 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 4
Q3: 4.5
95th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 4.5

Descriptive statistics

Standard deviation: 1.9648962
Coefficient of variation (CV): 0.67358708
Kurtosis: -1.3038216
Mean: 2.9170634
Median Absolute Deviation (MAD): 1
Skewness: -0.58157324
Sum: 83323
Variance: 3.8608173
Monotonicity: Not monotonic

[Figure not reproduced: histogram with fixed size bins (bins=11)]

Common values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 0 | 7948 | 27.8% |
| 4 | 7582 | 26.5% |
| 5 | 6556 | 23.0% |
| 3 | 2886 | 10.1% |
| 4.5 | 1261 | 4.4% |
| 3.5 | 895 | 3.1% |
| 2 | 752 | 2.6% |
| 2.5 | 360 | 1.3% |
| 1 | 228 | 0.8% |
| 1.5 | 70 | 0.2% |

Minimum 10 values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 0 | 7948 | 27.8% |
| 0.5 | 26 | 0.1% |
| 1 | 228 | 0.8% |
| 1.5 | 70 | 0.2% |
| 2 | 752 | 2.6% |
| 2.5 | 360 | 1.3% |
| 3 | 2886 | 10.1% |
| 3.5 | 895 | 3.1% |
| 4 | 7582 | 26.5% |
| 4.5 | 1261 | 4.4% |

Maximum 10 values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 5 | 6556 | 23.0% |
| 4.5 | 1261 | 4.4% |
| 4 | 7582 | 26.5% |
| 3.5 | 895 | 3.1% |
| 3 | 2886 | 10.1% |
| 2.5 | 360 | 1.3% |
| 2 | 752 | 2.6% |
| 1.5 | 70 | 0.2% |
| 1 | 228 | 0.8% |
| 0.5 | 26 | 0.1% |
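
Because rating takes only 11 distinct values, its summary statistics can be recomputed from the value counts listed for this variable. The sketch below is an independent arithmetic check, not the profiler's own code. It assumes the variance uses the sample (n - 1) denominator, which is the convention that matches the reported figure.

```python
import math

# Value counts for rating as listed in this report.
rating_counts = {
    0: 7948, 0.5: 26, 1: 228, 1.5: 70, 2: 752, 2.5: 360,
    3: 2886, 3.5: 895, 4: 7582, 4.5: 1261, 5: 6556,
}

n = sum(rating_counts.values())
total = sum(value * count for value, count in rating_counts.items())
mean = total / n

# Sample variance (n - 1 denominator); an assumption that matches the report.
squared_dev = sum(count * (value - mean) ** 2 for value, count in rating_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation
zeros_pct = 100 * rating_counts[0] / n

print(f"n={n} sum={total} mean={mean:.7f}")
print(f"variance={variance:.7f} std={std:.7f} cv={cv:.8f}")
print(f"zeros={zeros_pct:.1f}%")
```

The large share of zeros pulls the mean (about 2.92) well below the median of 4, which is worth keeping in mind if a rating of 0 means "unrated" rather than "lowest quality".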

latitude
Real number (ℝ)

Alerts: High correlation, Missing

Distinct: 11042
Distinct (%): 39.8%
Missing: 809
Missing (%): 2.8%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.53348049
Minimum: -54.8574
Maximum: 68.3748
Zeros: 0
Zeros (%): 0.0%
Negative: 12956
Negative (%): 45.4%
Memory size: 223.3 KiB

Quantile statistics

Minimum: -54.8574
5th percentile: -28.9361
Q1: -15.0846
Median: 1.1316
Q3: 9.511
95th percentile: 28.5785
Maximum: 68.3748
Range: 123.2322
Interquartile range (IQR): 24.5956

Descriptive statistics

Standard deviation: 17.609276
Coefficient of variation (CV): -33.008285
Kurtosis: -0.056935472
Mean: -0.53348049
Median Absolute Deviation (MAD): 10.6881
Skewness: 0.16107522
Sum: -14806.751
Variance: 310.08661
Monotonicity: Not monotonic

[Figure not reproduced: histogram with fixed size bins (bins=50)]

Common values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 4.351 | 276 | 1.0% |
| -22.4508 | 225 | 0.8% |
| -14.625 | 187 | 0.7% |
| -16.5631 | 186 | 0.7% |
| 3.5026 | 165 | 0.6% |
| 0.883 | 142 | 0.5% |
| -16.6003 | 112 | 0.4% |
| 5.2461 | 104 | 0.4% |
| -2.9666 | 104 | 0.4% |
| 1.4898 | 100 | 0.4% |
| Other values (11032) | 26154 | 91.6% |
| (Missing) | 809 | 2.8% |

Minimum 10 values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| -54.8574 | 1 | < 0.1% |
| -54.8378 | 4 | < 0.1% |
| -54.21 | 1 | < 0.1% |
| -53.1702 | 1 | < 0.1% |
| -52.7719 | 1 | < 0.1% |
| -50.5001 | 1 | < 0.1% |
| -50.3334 | 1 | < 0.1% |
| -50.325 | 1 | < 0.1% |
| -49.3039 | 1 | < 0.1% |
| -47.241 | 1 | < 0.1% |

Maximum 10 values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 68.3748 | 1 | < 0.1% |
| 64.9118 | 1 | < 0.1% |
| 64.8907 | 1 | < 0.1% |
| 63.964 | 1 | < 0.1% |
| 62.014 | 1 | < 0.1% |
| 61.6122 | 1 | < 0.1% |
| 61.3953 | 1 | < 0.1% |
| 60.85 | 1 | < 0.1% |
| 60.8227 | 1 | < 0.1% |
| 60.6096 | 1 | < 0.1% |

longitude
Real number (ℝ)

Alerts: High correlation, Missing

Distinct: 11048
Distinct (%): 39.8%
Missing: 809
Missing (%): 2.8%
Infinite: 0
Infinite (%): 0.0%
Mean: -69.496887
Minimum: -163.68
Maximum: -0.0322
Zeros: 0
Zeros (%): 0.0%
Negative: 27755
Negative (%): 97.2%
Memory size: 223.3 KiB

Quantile statistics

Minimum: -163.68
5th percentile: -98.74451
Q1: -79.6941
Median: -73.6252
Q3: -54.11215
95th percentile: -42.7167
Maximum: -0.0322
Range: 163.6478
Interquartile range (IQR): 25.58195

Descriptive statistics

Standard deviation: 18.247133
Coefficient of variation (CV): -0.26256044
Kurtosis: 0.42292021
Mean: -69.496887
Median Absolute Deviation (MAD): 12.3585
Skewness: -0.1092373
Sum: -1928886.1
Variance: 332.95787
Monotonicity: Not monotonic

[Figure not reproduced: histogram with fixed size bins (bins=50)]

Common values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| -74.652 | 276 | 1.0% |
| -42.7735 | 225 | 0.8% |
| -49.0051 | 187 | 0.7% |
| -49.285 | 186 | 0.7% |
| -76.3552 | 165 | 0.6% |
| -78.8 | 130 | 0.5% |
| -49.2802 | 112 | 0.4% |
| -60.7402 | 104 | 0.4% |
| -75.6853 | 102 | 0.4% |
| -49.604 | 99 | 0.3% |
| Other values (11038) | 26169 | 91.6% |
| (Missing) | 809 | 2.8% |

Minimum 10 values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| -163.68 | 1 | < 0.1% |
| -158.1053 | 1 | < 0.1% |
| -157.9532 | 1 | < 0.1% |
| -157.9461 | 1 | < 0.1% |
| -157.816 | 1 | < 0.1% |
| -155.9903 | 1 | < 0.1% |
| -155.9732 | 1 | < 0.1% |
| -155.9666 | 1 | < 0.1% |
| -155.9288 | 1 | < 0.1% |
| -155.8359 | 1 | < 0.1% |

Maximum 10 values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| -0.0322 | 1 | < 0.1% |
| -0.3004 | 1 | < 0.1% |
| -0.3093 | 1 | < 0.1% |
| -0.3245 | 1 | < 0.1% |
| -0.4584 | 2 | < 0.1% |
| -0.547 | 2 | < 0.1% |
| -0.5918 | 1 | < 0.1% |
| -0.672 | 3 | < 0.1% |
| -0.6857 | 1 | < 0.1% |
| -0.6887 | 1 | < 0.1% |

scientific_name
Text

Distinct: 206
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Length

Max length: 28
Median length: 24
Mean length: 19.017715
Min length: 9

Characters and Unicode

Total characters: 543222
Distinct characters: 49
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Ragoniella pulchella
2nd row: Ragoniella pulchella
3rd row: Oxyprora surinamensis
4th row: Oxyprora surinamensis
5th row: Oxyprora surinamensis

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| flaveola | 1029 | 1.8% |
| sulphuratus | 990 | 1.7% |
| pitangus | 990 | 1.7% |
| tyrannus | 867 | 1.5% |
| albicollis | 808 | 1.4% |
| nyctidromus | 808 | 1.4% |
| melancholicus | 787 | 1.4% |
| myiozetetes | 779 | 1.4% |
| tolmomyias | 713 | 1.2% |
| magnirostris | 709 | 1.2% |
| Other values (353) | 48565 | 85.1% |

Most occurring characters

| Value | Count | Frequency (%) |
| --- | --- | --- |
| a | 61253 | 11.3% |
| s | 45925 | 8.5% |
| i | 44933 | 8.3% |
| u | 35167 | 6.5% |
| o | 34366 | 6.3% |
| e | 33790 | 6.2% |
| r | 32144 | 5.9% |
| l | 30834 | 5.7% |
| (space) | 28487 | 5.2% |
| n | 27476 | 5.1% |
| Other values (39) | 168847 | 31.1% |

Most occurring categories, scripts and blocks: (unknown), 543222 characters (100.0%) in each case. The per-category, per-script and per-block character tables are identical to the table above.

common_name
Text

Distinct: 206
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Length

Max length: 31
Median length: 25
Mean length: 18.234736
Min length: 6

Characters and Unicode

Total characters: 520857
Distinct characters: 51
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Ragoniella pulchella
2nd row: Ragoniella pulchella
3rd row: Oxyprora surinamensis
4th row: Oxyprora surinamensis
5th row: Oxyprora surinamensis

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| flycatcher | 3054 | 5.3% |
| tropical | 1654 | 2.9% |
| great | 1587 | 2.8% |
| common | 1561 | 2.7% |
| southern | 1077 | 1.9% |
| kiskadee | 990 | 1.7% |
| hawk | 924 | 1.6% |
| dove | 815 | 1.4% |
| pauraque | 808 | 1.4% |
| kingbird | 787 | 1.4% |
| Other values (302) | 44123 | 76.9% |

Most occurring characters

| Value | Count | Frequency (%) |
| --- | --- | --- |
| e | 55026 | 10.6% |
| a | 44752 | 8.6% |
| r | 34794 | 6.7% |
| o | 31113 | 6.0% |
| l | 30197 | 5.8% |
| (space) | 28822 | 5.5% |
| i | 27888 | 5.4% |
| t | 27497 | 5.3% |
| d | 23920 | 4.6% |
| n | 23681 | 4.5% |
| Other values (41) | 193167 | 37.1% |

Most occurring categories, scripts and blocks: (unknown), 520857 characters (100.0%) in each case. The per-category, per-script and per-block character tables are identical to the table above.

genus
Text

Distinct: 174
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB

Length

Max length: 17
Median length: 14
Mean length: 9.1125893
Min length: 3

Characters and Unicode

Total characters: 260292
Distinct characters: 45
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Ragoniella
2nd row: Ragoniella
3rd row: Oxyprora
4th row: Oxyprora
5th row: Oxyprora

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| pitangus | 990 | 3.5% |
| tyrannus | 867 | 3.0% |
| nyctidromus | 808 | 2.8% |
| myiozetetes | 779 | 2.7% |
| tolmomyias | 713 | 2.5% |
| rupornis | 709 | 2.5% |
| coereba | 610 | 2.1% |
| leptotila | 572 | 2.0% |
| setophaga | 564 | 2.0% |
| megarynchus | 514 | 1.8% |
| Other values (164) | 21438 | 75.1% |

Most occurring characters

| Value | Count | Frequency (%) |
| --- | --- | --- |
| a | 27302 | 10.5% |
| o | 22181 | 8.5% |
| s | 19604 | 7.5% |
| i | 18430 | 7.1% |
| e | 18046 | 6.9% |
| r | 17961 | 6.9% |
| t | 14978 | 5.8% |
| n | 13134 | 5.0% |
| u | 13040 | 5.0% |
| l | 10919 | 4.2% |
| Other values (35) | 84697 | 32.5% |

Most occurring categories, scripts and blocks: (unknown), 260292 characters (100.0%) in each case. The per-category, per-script and per-block character tables are identical to the table above.

species
Text

Distinct: 195
Distinct (%): 0.7%
Missing: 83
Missing (%): 0.3%
Memory size: 1.6 MiB

Length

Max length: 15
Median length: 12
Mean length: 8.9337804
Min length: 3

Characters and Unicode

Total characters: 254443
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: pulchella
2nd row: pulchella
3rd row: surinamensis
4th row: surinamensis
5th row: surinamensis

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| flaveola | 1029 | 3.6% |
| sulphuratus | 990 | 3.5% |
| albicollis | 808 | 2.8% |
| melancholicus | 787 | 2.8% |
| magnirostris | 709 | 2.5% |
| verreauxi | 572 | 2.0% |
| similis | 543 | 1.9% |
| sulphurescens | 525 | 1.8% |
| pitangua | 514 | 1.8% |
| leucosticta | 499 | 1.8% |
| Other values (185) | 21505 | 75.5% |

Most occurring characters

| Value | Count | Frequency (%) |
| --- | --- | --- |
| a | 33951 | 13.3% |
| i | 26503 | 10.4% |
| s | 26321 | 10.3% |
| u | 22127 | 8.7% |
| l | 19915 | 7.8% |
| c | 15867 | 6.2% |
| e | 15744 | 6.2% |
| n | 14342 | 5.6% |
| r | 14183 | 5.6% |
| o | 12185 | 4.8% |
| Other values (15) | 53305 | 20.9% |

Most occurring categories, scripts and blocks: (unknown), 254443 characters (100.0%) in each case. The per-category, per-script and per-block character tables are identical to the table above.

inat_taxon_id
Real number (ℝ)

Distinct: 206
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79030.885
Minimum: 519
Maximum: 1564122
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 223.3 KiB

Quantile statistics

Minimum: 519
5th percentile: 2062
Q1: 8479
Median: 16737
Q3: 19788
95th percentile: 513889
Maximum: 1564122
Range: 1563603
Interquartile range (IQR): 11309

Descriptive statistics

Standard deviation: 207306.13
Coefficient of variation (CV): 2.6231027
Kurtosis: 23.258045
Mean: 79030.885
Median Absolute Deviation (MAD): 6873
Skewness: 4.542897
Sum: 2.2574382 × 10^9
Variance: 4.2975832 × 10^10
Monotonicity: Not monotonic

[Figure not reproduced: histogram with fixed size bins (bins=50)]

Common values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 16956 | 990 | 3.5% |
| 19627 | 808 | 2.8% |
| 16787 | 787 | 2.8% |
| 201041 | 709 | 2.5% |
| 10199 | 610 | 2.1% |
| 3280 | 572 | 2.0% |
| 16842 | 543 | 1.9% |
| 16567 | 525 | 1.8% |
| 16737 | 514 | 1.8% |
| 7613 | 499 | 1.7% |
| Other values (196) | 22007 | 77.0% |

Minimum 10 values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 519 | 147 | 0.5% |
| 1300 | 50 | 0.2% |
| 1468 | 261 | 0.9% |
| 1538 | 17 | 0.1% |
| 1593 | 55 | 0.2% |
| 1970 | 127 | 0.4% |
| 1971 | 287 | 1.0% |
| 1989 | 431 | 1.5% |
| 2050 | 20 | 0.1% |
| 2062 | 77 | 0.3% |

Maximum 10 values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| 1564122 | 6 | < 0.1% |
| 1462737 | 7 | < 0.1% |
| 1462711 | 3 | < 0.1% |
| 1432779 | 238 | 0.8% |
| 1346504 | 5 | < 0.1% |
| 1289646 | 15 | 0.1% |
| 1289601 | 90 | 0.3% |
| 1286908 | 85 | 0.3% |
| 1194042 | 3 | < 0.1% |
| 1192948 | 4 | < 0.1% |

class_name
Categorical

Alerts: High correlation, Imbalance

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 8
Median length: 4
Mean length: 4.1228469
Min length: 4

Characters and Unicode

Total characters: 117765
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Insecta
2nd row: Insecta
3rd row: Insecta
4th row: Insecta
5th row: Insecta

Common Values

| Value | Count | Frequency (%) |
| --- | --- | --- |
| Aves | 27648 | 96.8% |
| Amphibia | 583 | 2.0% |
| Mammalia | 178 | 0.6% |
| Insecta | 155 | 0.5% |

[Figures not reproduced: histogram of lengths of the category; Common Values (Plot)]

Words

| Value | Count | Frequency (%) |
| --- | --- | --- |
| aves | 27648 | 96.8% |
| amphibia | 583 | 2.0% |
| mammalia | 178 | 0.6% |
| insecta | 155 | 0.5% |

Most occurring characters

| Value | Count | Frequency (%) |
| --- | --- | --- |
| A | 28231 | 24.0% |
| e | 27803 | 23.6% |
| s | 27803 | 23.6% |
| v | 27648 | 23.5% |
| i | 1344 | 1.1% |
| a | 1272 | 1.1% |
| m | 939 | 0.8% |
| p | 583 | 0.5% |
| h | 583 | 0.5% |
| b | 583 | 0.5% |
| Other values (6) | 976 | 0.8% |

Most occurring categories, scripts and blocks: (unknown), 117765 characters (100.0%) in each case. The per-category, per-script and per-block character tables are identical to the table above.

Interactions

[Figures not reproduced: 16 Matplotlib interaction plots]

Correlations

[Figure not reproduced: correlation heatmap]

| | class_name | collection | inat_taxon_id | latitude | longitude | rating |
| --- | --- | --- | --- | --- | --- | --- |
| class_name | 1.000 | 0.706 | 0.209 | 0.182 | 0.095 | 0.130 |
| collection | 0.706 | 1.000 | 0.241 | 0.265 | 0.190 | 0.669 |
| inat_taxon_id | 0.209 | 0.241 | 1.000 | 0.033 | 0.006 | -0.040 |
| latitude | 0.182 | 0.265 | 0.033 | 1.000 | -0.765 | -0.119 |
| longitude | 0.095 | 0.190 | 0.006 | -0.765 | 1.000 | 0.101 |
| rating | 0.130 | 0.669 | -0.040 | -0.119 | 0.101 | 1.000 |
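
The High correlation alerts listed near the top of the report correspond to the off-diagonal entries of this matrix with large absolute values. The sketch below recovers those pairs by scanning the upper triangle. The cutoff of 0.5 is an assumption chosen because it separates the flagged pairs from the rest; the report does not state the threshold that was actually applied.

```python
labels = ["class_name", "collection", "inat_taxon_id", "latitude", "longitude", "rating"]

# Correlation matrix copied from this report (rows and columns follow `labels`).
matrix = [
    [1.000, 0.706, 0.209, 0.182, 0.095, 0.130],
    [0.706, 1.000, 0.241, 0.265, 0.190, 0.669],
    [0.209, 0.241, 1.000, 0.033, 0.006, -0.040],
    [0.182, 0.265, 0.033, 1.000, -0.765, -0.119],
    [0.095, 0.190, 0.006, -0.765, 1.000, 0.101],
    [0.130, 0.669, -0.040, -0.119, 0.101, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report

# Scan the upper triangle so that each pair is reported once.
high_pairs = [
    (labels[i], labels[j], matrix[i][j])
    for i in range(len(labels))
    for j in range(i + 1, len(labels))
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:+.3f}")
```

With this cutoff the flagged pairs are class_name with collection, collection with rating, and latitude with longitude, which matches the alert list.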

Missing values

[Figure not reproduced] A simple visualization of nullity by column.
[Figure not reproduced] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure not reproduced] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

| | primary_label | secondary_labels | type | filename | collection | rating | latitude | longitude | scientific_name | common_name | genus | species | inat_taxon_id | class_name |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1139490 | [''] | [''] | 1139490/CSA36385.ogg | CSA | 0.0 | 7.3206 | -73.7128 | Ragoniella pulchella | Ragoniella pulchella | Ragoniella | pulchella | 1139490 | Insecta |
| 1 | 1139490 | [''] | [''] | 1139490/CSA36389.ogg | CSA | 0.0 | 7.3206 | -73.7128 | Ragoniella pulchella | Ragoniella pulchella | Ragoniella | pulchella | 1139490 | Insecta |
| 2 | 1192948 | [''] | [''] | 1192948/CSA36358.ogg | CSA | 0.0 | 7.3791 | -73.7313 | Oxyprora surinamensis | Oxyprora surinamensis | Oxyprora | surinamensis | 1192948 | Insecta |
| 3 | 1192948 | [''] | [''] | 1192948/CSA36366.ogg | CSA | 0.0 | 7.2800 | -73.8582 | Oxyprora surinamensis | Oxyprora surinamensis | Oxyprora | surinamensis | 1192948 | Insecta |
| 4 | 1192948 | [''] | [''] | 1192948/CSA36373.ogg | CSA | 0.0 | 7.3791 | -73.7313 | Oxyprora surinamensis | Oxyprora surinamensis | Oxyprora | surinamensis | 1192948 | Insecta |
| 5 | 1192948 | [''] | [''] | 1192948/CSA36388.ogg | CSA | 0.0 | 7.3791 | -73.7313 | Oxyprora surinamensis | Oxyprora surinamensis | Oxyprora | surinamensis | 1192948 | Insecta |
| 6 | 1194042 | [''] | [''] | 1194042/CSA18783.ogg | CSA | 0.0 | 5.7892 | -73.5504 | Copiphora colombiae | Copiphora colombiae | Copiphora | colombiae | 1194042 | Insecta |
| 7 | 1194042 | [''] | [''] | 1194042/CSA18794.ogg | CSA | 0.0 | 5.7892 | -73.5504 | Copiphora colombiae | Copiphora colombiae | Copiphora | colombiae | 1194042 | Insecta |
| 8 | 1194042 | [''] | [''] | 1194042/CSA18802.ogg | CSA | 0.0 | 5.7892 | -73.5504 | Copiphora colombiae | Copiphora colombiae | Copiphora | colombiae | 1194042 | Insecta |
| 9 | 126247 | ['65448', '22976', '476538'] | ['advertisement call'] | 126247/XC941297.ogg | XC | 3.5 | 9.0465 | -79.3024 | Leptodactylus insularum | Spotted Foam-nest Frog | Leptodactylus | insularum | 126247 | Amphibia |

Last rows

| | primary_label | secondary_labels | type | filename | collection | rating | latitude | longitude | scientific_name | common_name | genus | species | inat_taxon_id | class_name |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 28554 | ywcpar | [''] | [''] | ywcpar/iNat612649.ogg | iNat | 0.0 | -6.1902 | -50.0840 | Amazona ochrocephala | Yellow-crowned Parrot | Amazona | ochrocephala | 19003 | Aves |
| 28555 | ywcpar | [''] | [''] | ywcpar/iNat681484.ogg | iNat | 0.0 | 6.3189 | -75.5545 | Amazona ochrocephala | Yellow-crowned Parrot | Amazona | ochrocephala | 19003 | Aves |
| 28556 | ywcpar | [''] | [''] | ywcpar/iNat681488.ogg | iNat | 0.0 | 6.2883 | -75.4435 | Amazona ochrocephala | Yellow-crowned Parrot | Amazona | ochrocephala | 19003 | Aves |
| 28557 | ywcpar | [''] | [''] | ywcpar/iNat681523.ogg | iNat | 0.0 | 6.3008 | -75.4589 | Amazona ochrocephala | Yellow-crowned Parrot | Amazona | ochrocephala | 19003 | Aves |
| 28558 | ywcpar | [''] | [''] | ywcpar/iNat742782.ogg | iNat | 0.0 | 9.2447 | -70.3002 | Amazona ochrocephala | Yellow-crowned Parrot | Amazona | ochrocephala | 19003 | Aves |
| 28559 | ywcpar | [''] | [''] | ywcpar/iNat77392.ogg | iNat | 0.0 | 7.6921 | -80.3379 | Amazona ochrocephala | Yellow-crowned Parrot | Amazona | ochrocephala | 19003 | Aves |
| 28560 | ywcpar | [''] | [''] | ywcpar/iNat78624.ogg | iNat | 0.0 | 8.9918 | -79.4877 | Amazona ochrocephala | Yellow-crowned Parrot | Amazona | ochrocephala | 19003 | Aves |
| 28561 | ywcpar | [''] | [''] | ywcpar/iNat789234.ogg | iNat | 0.0 | 9.2316 | -70.2041 | Amazona ochrocephala | Yellow-crowned Parrot | Amazona | ochrocephala | 19003 | Aves |
| 28562 | ywcpar | [''] | [''] | ywcpar/iNat819873.ogg | iNat | 0.0 | 10.5838 | -66.8545 | Amazona ochrocephala | Yellow-crowned Parrot | Amazona | ochrocephala | 19003 | Aves |
| 28563 | ywcpar | [''] | [''] | ywcpar/iNat922688.ogg | iNat | 0.0 | 9.1156 | -79.4907 | Amazona ochrocephala | Yellow-crowned Parrot | Amazona | ochrocephala | 19003 | Aves |